A Comparative Study of Microaggregation Methods
Authors: Josep M. Mateo-Sanz and Josep Domingo
Abstract
Microaggregation is a statistical disclosure control technique for microdata. Raw microdata (i.e. individual records) are grouped into small aggregates prior to publication. Each aggregate should contain at least k records to prevent disclosure of individual information. Fixed-size microaggregation consists of taking fixed-size microaggregates (size k). Data-oriented microaggregation (with variable group size) was introduced recently. Regardless of the group size, microaggregates on a multidimensional data set can be formed using univariate techniques on projected data or using multivariate techniques. This paper presents the first method for multivariate fixed-size microaggregation. In addition, a real data set is used to compare the information loss and output data quality of fixed-size vs. data-oriented, and univariate vs. multivariate microaggregation.
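The abstract does not spell out any algorithm, so the following is only a minimal sketch of the general idea of univariate fixed-size microaggregation, not the method proposed in the paper. It assumes that records are sorted by a single value, grouped into consecutive blocks of k, and that each value is replaced by its group mean; the function name `fixed_size_microaggregate` and the choice to let the last group absorb leftover records are illustrative assumptions.

```python
from statistics import mean


def fixed_size_microaggregate(values, k):
    """Replace each value by the mean of its group of sorted neighbours.

    Hypothetical helper for illustration only: groups have size k, except
    the last one, which absorbs the len(values) % k leftover records so
    that every group holds at least k records.
    """
    n = len(values)
    if k < 1 or n < k:
        raise ValueError("need k >= 1 and at least k records")

    # Indices of the records in ascending order of value.
    order = sorted(range(n), key=lambda i: values[i])

    result = [0.0] * n
    n_groups = n // k
    for g in range(n_groups):
        start = g * k
        # The last group runs to the end and takes any leftover records.
        end = n if g == n_groups - 1 else start + k
        group = order[start:end]
        group_mean = mean(values[i] for i in group)
        # Publish the group mean in place of each individual value.
        for i in group:
            result[i] = group_mean
    return result


if __name__ == "__main__":
    print(fixed_size_microaggregate([10, 1, 2, 11, 3, 12, 30], 3))
```

With k = 3 and seven records, this sketch forms one group of three and one group of four, so no published value can be traced to fewer than three original records. A data-oriented variant would instead let the group boundaries depend on the data, which is the comparison the abstract refers to.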
Similar resources
A Comparative Study of Microaggregation Methods
Microaggregation is a statistical disclosure control technique for microdata. Raw microdata (i.e. individual records) are grouped into small aggregates prior to publication. Each aggregate should contain at least k records to prevent disclosure of individual information. Fixed-size microaggregation consists of taking fixed-size microaggregates (size k). Data-oriented microaggregation (with vari...
Comparing SDC Methods for Microdata on the Basis of Information Loss and Disclosure Risk
We present in this paper the first empirical comparison of SDC methods for microdata which encompasses both continuous and categorical microdata. Based on re-identification experiments, we try to optimize the tradeoff between information loss and disclosure risk. First, relevant SDC methods for continuous and categorical microdata are identified. Then generic information loss measures (not targ...
Regression for ordinal variables without underlying continuous variables
Several techniques exist nowadays for continuous (i.e. numerical) data analysis and modeling. However, although part of the information gathered by companies, statistical offices and other institutions is numerical, a large part of it is represented using categorical variables in ordinal or nominal scales. Techniques for model building on categorical data are required to take advantage of such ...
Current Directions in Statistical Data Protection
Statistical data protection can be viewed as an heir of the work that was started on statistical database protection in the 70s and 80s. Massive production of computerized statistics by government agencies combined with an increasing social importance of individual privacy has led to a renewed interest in this topic. This paper summarizes recent activity in statistical data protection, then outl...